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1 Introduction 


Many important problems in science and engineering require the computation of a small 
number of eigenvalues with algebraically largest (or smallest) real parts of a large nonsym- 
metric real matrix A. Among the typical examples from the literature, see e.g., [9], we only 
mention the important class of stability analysis and more generally of bifurcation prob- 
lems [17], from which we will draw our main test example. From the numerical point of 
view, nonsymmetric eigenvalue problems can be substancially more difficult to solve than 
the symmetric ones. This is due to the fact that eigenvalues of large matrices can be arbi- 
trarily poorly conditioned. In this paper we will propose a few techniques and tools that 
can be combined with the traditional projection methods to enhance their efficiency and 
robustness. 

There have been mainly three basic projection methods for solving large nonsymmetric 
eigenvalue problems investigated so far. The first is Bauer’s subspace iteration method and 
its many variations [2,7,16,15,41,42,45]. An important drawback of this method, recognized 
both in the symmetric case [23,21], and the nonsymmetric case [34,29] is that it may be 
exceedingly slow to converge. Another known weakness is that it computes the dominant 
eigenvalues of A, i.e., those having largest modulii, whereas in many important applications 
it is the eigenvalues with largest real parts that are wanted. However, this difficulty can be 
obviated by using Chebyshev acceleration as is suggested in [29]. The second method is due 
to Arnoldi [1,34] and is essentially an orthogonal projection method on the Krylov subspace 
{vi, Av i, . . . A m-l ui}. Thus, Arnoldi’s method is a generalization of the symmetric Lanczos 
algorithm. Its main drawback is that, unlike the symmetric Lanczos algorithm, the growth of 
computational time and storage becomes excessively high as the number of steps increases. 
Variations on the basic scheme have been proposed [34], which lead to oblique projection 
type techniques [30], but their theory is not well understood. Finally, the third method 
is the nonsymmetric Lanczos method [8,19,25,22,43] which is another generalization of the 
symmetric Lanczos algorithm due originally to Lanczos. It produces a tridiagonal matrix 
some eigenvalues of which can be taken as approximations to the eigenvalues of A. At the 
difference with Arnoldi’s method, this is not an orthogonal projection method, but an oblique 
projection method [30]. The algorithm runs the risk of breaking down at any step, which 
has given a poor reputation to the method in the past [45]. Parlett, Taylor and Liu [22] have 
suggested an elegant solution to this problem. Cullum and Willoughby [8] have extended 
their symmetric Lanczos algorithm without reorthogonalization, to the nonsymmetric case 
and suggest a new way for dealing with the resulting non-hermitian tridiagonal matrices. 
On the whole the main difficulty with the nonsymmetric Lanczos method is theoretical, as 
the method is not too well understood. 

To these three basic methods one should add a number of tools such as deflation processes 
and acceleration/ preconditioning techniques the best example of which is the so-called shift- 
and-invert strategy. This paper is not concerned with the projection methods enumerated 
above but rather with the implementation of these secondary tools. It shows their importance 
and proposes ways to put together efficient methods that exploits them in conjunction with 
the projection techniques. More specifically we describe: 
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• A particular case of Wielandt deflation which is of interest when computing Schur 
vectors. We refer to this as Schur-Wielandt deflation. 

• A shift-and-invert strategy for general nonhermitian matrices. 

• A polynomial preconditioning technique consisting of iterating with the matrix p(A), 
where p is a carefully chosen low degree polynomial. 

These techniques can be combined with any of the three projection methods discussed 
above but we will illustrate their implementation only with Arnoldi’s method for the sake of 
brievety. 

Before proceeding further we would like to point out a few key differences between the 
three projection methods. First, we emphasize that, in practice, subspace iteration is only 
able to compute a small number of eigenvalues and associated eigenvectors of a large nonsym- 
metric matrix. To some extent Amoldi’s method presents the same limitations in practice. 
The nonsymmetric Lanczos algorithm without reorthogonalization or with some form of 
partial reorthogonalization is the only method that has the potential of computing a large 
number of eigenvalues and eigenvectors of a nonsymmetric matrix A [8,25]. On the other 
hand the Lanczos algorithm requires the use of both the matrix A and of its transpose. In 
some applications the matrix A is not available explicitly but the action of multiplying A 
by a vector is easy to perform, by use of a finite difference formula. In those cases A 7 is not 
available and cannot even be approximated with finite differencing. For example one may be 
interested in studying the stability of a dynamical system governed by a partial differential 
equation of the form 

I = '<“•*> M 

where F is a partial differential operator, and 6 some real parameter, as 6 varies. Such a 
system is said to be stable if all the eigenvalues of the Jacobian of F with respect to u, 
computed at the steady state solution, have negative real parts. All that may be wanted 
here is to compute one eigenvalue or a complex pair of eigenvalues. Although the Jacobian 
matrix J at some coordinate u may not be available explicitly, the multiplication of J by 
an arbitrary vector x can be carried out, usually at low cost, with the help of the difference 
formula 

F(u + cs, 6) — F(u, 6) 


Jx fa 


( 2 ) 


where e is some small and carefully chosen scalar. 

The approximation (2) has been useful in solving nonlinear systems of equations [3,6,12,4, 
44,18] and to compute eigenvalues of various semi-discrete operators [11] used in compressible 
fluid flow calculations. Here, an algorithm such as Arnoldi’s method can be used but not 
the nonsymmetric Lanczos procedure since we do not know to compute the vector J T x for 
any vector x when the Jacobian matrix J is not explicitly available. 

In the numerical experiments section we illustrate these techniques with an example 
issued from a well-known and simple bifurcation model from Chemical engineering. Problems 
of this type are numerous in structural engineering [5], in aerodynamics (the panel flutter 
problem [37]), chemical engineering [14], fluid mechanics [10] and many other fields. 
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2 A Schur-Wielandt deflation technique 

When used with caution, deflation procedures can be quite useful and effective if a small 
number of eigenvalues and eigenvectors are to be computed. In the nonsymmetric case many 
common deflation techniques require the knowledge of both right and left eigenvectors. These 
procedures, an example of which is Hotelling’s deflation, can be ill-conditioned if only because 
the determination of eigenvectors of a general sparse matrix can be itself untrustworthy. In 
fact in the defective case there may not exist a basis of the invariant subspace consisting of 
eigenvectors and therefore any numerical method that attempts to determine such a basis 
will have numerical difficulties. As suggested by Stewart [42] it is preferable to work with 
Schur vectors, i.e. with an orthonormal basis of the invariant subspace, when dealing with 
the nonsymmetric eigenvalue problem. A partial Schur factorization is of the form 

AU = UR (3) 

where U is an N x p complex orthogonal matrix (U H U = I) and R is upper triangular 
complex matrix. Here, X H denotes the transpose of the complex conjugate of a matrix X. 
Note that the order of the eigenvalues Ai, A 2 , .. . A p as they appear in the upper triangular 
matrix R is crucial. In fact, when the eigenvalues Ai,A 3 ,..A p are distinct, then for a given 
order this factorization is unique in the usual sense of QR factorizations, i.e. the columns 
of U are uniquely determined up to a sign of the form e* 9 . Thus, whenever we choose a 
certain ordering of the eigenvalues, we can deal with the Schur vectors without confusion in 
the same way that we deal with the eigenvectors of a Hermitian matrix. We will consider 
later the problem of avoiding complex arithmetic when the matrix A is real. 

In this section we describe a deflation technique which is a simple variation of Wielandt’s 
deflation and show how it can be put to work to compute an orthonormal basis of an invariant 
subspace and the corresponding partial Schur forms. We start our discussion with the general 
Wielandt deflation with one vector and explain why Schur-Wielandt deflation is often nearly 
optimal. In the following we denote by ||.||2 the 2-norm in C N . Unless otherwise stated the 
eigenvalues are ordered in decreasing order of their real parts (if a conjugate pair occur then 
the one with positive imaginary part is first). All eigenvectors are assumed to be normalized 
by their Euclidean norms. 

2.1 Wielandt deflation with one vector 

Suppose that we have computed the eigenvalue Ai of algebraically largest real part and a 
corresponding eigenvector «i of a matrix A by some basic algorithm, say algorithm (A), which 
delivers the eigenvalue of largest real part of the input matrix, along with an eigenvector. 
For example, if the matrix A is known to have real eigenvalues, algorithm (A) can be some 
variant of the power method. It is assumed in what follows that the vector Uj is normalized 
so that ||ui ||2 = 1. The problem is to compute the next eigenvalue A 2 of A. An old technique 
for achieving this is what is commonly called a deflation procedure: a rank one modification 
of the original matrix is performed so as to displace the eigenvalue Ai, while keeping all 
other eigenvalues unchanged. The rank one modification is chosen so that the eigenvalue A 2 
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becomes the one with largest real part of the modified matrix and therefore, Algorithm (A) 
can again be applied to the new matrix to retrieve the pair A 2 , « 2 . 

In the general procedure known as Wielandt’s deflation only the knowledge of the right 
eigenvector is required. The deflated matrix is of the form 

Ai = A - <tu\v h (4) 

where v is an arbitrary vector such that v H u\ = 1, and <7 is an appropriate shift. As is 
well-known the eigenvalues of Ai are the same as those of A except for the eigenvalue Ai 
which is transformed into the eigenvalue A 1 — <7. 

Theorem 2.1 (Wielandt): The spectrum of A t is 

= {Ai — <7, A 2 ,A 3 , ..., A p } (5) 

Moreover, the left eigenvectors of A associated with A 2 , A3, . . . , A/y are preserved under the 
deflation process, and so is the right eigenvector u 2 . 


It is important to determine what the right eigenvectors become when i ^ 1. For each i, 
we will seek a right eigenvector of A\ in the form of u,- = u« — 7i«i- We have 

AiUi = (A - cruiv H )(ui - 7,-tii) = A iUi - [7.A1 + <r(«„ v) - <77,]ui (6) 


When 2 = 1, taking 71 = 0 shows again that any eigenvector associated with the eigenvalue 
A x remains an eigenvector of A\, associated with the eigenvalue A*— <7. For i 1, it is possible 
to choose 7,- so that the vector u, is an eigenvector of A\ associated with the eigenvalue A,-: 


<f (»i, v) 

7i tr — (A, - AO 


m 


Observe that the above expression is not defined when the denominator vanishes, but it is 
then known that the eigenvalue A, = A x — <7 is already an eigenvalue of Ai, i.e., the eigenvalue 
A x — <7 becomes multiple, and we only have one eigenvector namely u x . 

There are infinitely many different ways of choosing the vector v which can be taken to 
be any vector in the affine subspace of dimension N — 1 of vectors whose inner product with 
the vector is equal to one. A common choice is to take v = W\ the left eigenvector. This 
is referred to as Hotelling’s deflation and has the advantage of preserving both the left and 
right eigenvectors of A as is seen from the fact 7, = 0 in this situation. 

An interesting question that remains to be answered is the following: among all the 
choices of v, which one(s) will provide the best possible condition number for the next eigen- 
value A 2 to be computed ? The condition number of an eigenvalue is defined as the inverse of 
the cosine of the angle between its corresponding right and left eigenvectors. It is a measure 
of the sensitivity of the eigenvalue to perturbations in A. Thus, a large condition number, 
i.e., a poorly conditioned eigenvalue, will cause difficulties to the numerical algorithm that 
is used to compute that eigenvalue. This consitutes the motivation for the above question. 


\ 
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We will distinguish the eigenvalues and eigenvectors associated with the matrix A x from 
those of A by denoting them with a tilde. The condition number of the next eigenvalue \ 2 
to be computed is, by definition [45,13], 


Cond( A 2 ) = 




( 8 ) 


where u 2 , W2 are the right and left eigenvectors of A x associated with the eigenvalue A 2 . From 
what we have seen earlier, we know that w? = wj while u 2 — u 2 — 72 «i where 72 is given by 
(7). Assuming that ||u? 2 ||2 = 1 we get, 


Cond(X 2 ) = 


||U2 ~72«l||2 


(9) 


where we have used the fact that (uj, = 0. It is then clear from (9) that the condition 
number of A 2 is minimized when 72 = (u 2 , Ux) = cos0(u 2 , tti). Let us define z — u 2 — 
co$0{u 2 , ui)ui, the vector obtained by orthogonalizing U2 against ui. Substituting this result 
in (7) we obtain 

(z + cos(e(u 2 , Ui))u l5 u) = ( 10 ) 

which yields the condition, 


(z,v) = — (A x - A 2 ) cos0(ux,u 2 ) 

( 7 


(il) 


to which we must add the normalizing constraint, 


= 1 


( 12 ) 


There are still infinitely many vectors v that satisfy the above two conditions in general. 
Condition (11) may seem impractical because it uses the knowledge of the right eigenpair 
(A 2 , u 2 ) which we are precisely trying to compute but one can think of an adaptive procedure 
in which the deflation is adjusted as the iteration process delivers better approximations of 
(A 2 , u 2 ). Moreover, we are interested from the theoretical point of view, in analyzing how far 
from optimal are the common choices of v. The conditions (11) and (12) allow us to select 
v in a linear space of dimension 2. We will consider two important choices: 

(1) Choose v in the linear span of ui and u 2 ; 


(2) Choose v in the linear span of ui and w\. 

Consider (1) first. In this case we find that the optimal v is 

v op t = «i - -(Ax - A 2 ) cotan9{u 2 ,u x )z 

in which z = zf\\z\\ 2 . We also find that 

Cond(X 2 ) = Cond (\ 2 ) szn0(it2,Ui) 


(13) 


(14) 
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Not surprisingly, we note that when 9 is close to jt/2 , the choice v = ui is nearly optimal. 
This is the situation of the hermitian case. Moreover, when (A 2 — Ai) is small with respect 
to cr and the angle 0(u 2 , «i) is not too small, then the choice v = u\ is again nearly optimal. 
This particular choice has an interesting additional property: it preserves the Schur vectors. 

Proposition 2.1 Let u\ be an eigenvector of A of norm 1, associated with the eigenvalue 
Aj and let Ai = A — cruiu^ . Then the eigenvalues of Ai are At = Xi — a and A j = A j,j = 
2,3 Moreover, the Schur vectors associated with \j,j = 1, 2, 3 ..., N are identical with 
those of A. 

Proof Let AU = UR be the Schur factorization of A, where R is upper triangular and 
U is orthonormal. Then we have 


A\U = [A — <tu\Ui]U = UR — <ruie^ = U[R — creief*] (15) 

The result follows immediately. □ 


We now turn to the second way of choosing v. In this case we can express v as 
v = aui + ftwi and the optimality condition substituted in (7) yields, 



(U 2 ,QU! +/0U>i) 

1— (A, —A,)/, = 

(16) 

Because w\ is orthogonal to tt 2 

we immediately get 


; O 

Aj — A 2 

a = 1 

a 

(17) 

and from the constraint (12), 
yielding the optimal vector 

fim-L- (1-a) 
(wi,«l) 

(18) 

z 82 x ^2 

V ^ = ( l + 

(19) 


where we have set for convenience <5 2 = Aj — A 2 . Moreover, the new condition number 
Cond( A 2 ) is also given by (14). It is revealed by (19) that when the first eigenvector is well 
conditioned then since 6? is usually much smaller than <r, will not be too different from 
Uj . On the other hand if the first eigenvalue is very ill conditioned, the best v is closer to 
t£>i. An interesting difference with the previous case is that everything here is computable 
except S 2 which can be easily estimated dynamically. Apart from this practical difference, 
the two methods will provide the same condition number, i.e., they are, from the qualitative 
point of view, identical. However, the second method requires computing both the right and 
left eigenvectors whereas the first only requires the right eigenvector. 
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2.2 Deflation with a block of vectors 

Let ui, u 2 i • • • u j be a set of Schur vectors associated with the eigenvalues A x , A 2 , . . . A j. We 
denote by Uj the matrix of column vectors ui, u 2 , . . . Uj. Thus, 

Uj 3 [ui,ti 2 ,...,Wj] (20) 

is an orthonormal matrix whose columns form a basis of the eigenspace associated with 
the eigenvalues A x , A 2 , . . . A j. We do not assume here that these eigenvalues are real, so the 
matrix Uj may be complex. An immediate generalization of Proposition (2.1) is as follows. 

Proposition 2.2 Let E j be the p x p diagonal matrix E j = Diag{<T\,<Ti, . . .<Tj) . Then the 
eigenvalues of the matrix 

Aj = A- VfrUf, (21) 

are AJ = Aj — <Tj for i < j and AJ = Aj for i>j. Moreover, its associated Schur vectors are 

identical with those of A. 

Proof Let AU = U be the Schur factorization of A. Thus, the matrix of the first j 
columns of U being equal to Uj. We have 

AjU = {A- UfLjUf}U — UR — UjVjEf, (22) 

where Ej = [ei, e 2 , . . . ef\. Hence 

AjU = U[R - EjLEf] (23) 

and the result follows. □ 

2.3 Explicit Schur- Wielandt deflation in practice 

It is interesting to note that the preservation of the Schur vectors is analogous to the preser- 
vation of the eigenvectors under Hotelling’s deflation in the hermitian case. The above results 
suggest a very simple incremental deflation procedure consisting of building the matrix Uj 
one column at a time. Thus, at the j-th. step, once the eigenvector Uj+i of Aj is computed 
by the appropriate algorithm (A) we can orthonormalize it against all previous ttj’s to get 
the next Schur vector u J+1 which will be appended to Uj to form the new deflation matrix 
Uj+i. It is a simple exercise to show that the vector Uj+i thus computed is a Schur vector 
associated with the eigenvalue Aj +X and therefore at every stage of the process we have the 
desired decomposition 

AUj = UjRj , (24) 

where Rj is some j x j upper triangular matrix. 

More precisely we may consider the following algorithm, in which the successive shifts 
are chosen so that for example <7j = Aj. 

Algorithm 

Do i = 0, 1,2, ..., j — 1: 
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1. Define A,- = A,_ x — o , ,_ 1 u,_xU^. 1 (initially define A 0 = A) and compute the dominant 
eigenvalue A,- of A,- and the corresponding eigenvector u,-. 

2. Orthonormalize u, against u x , u 2 , u,_ x to get the vector 

From the practical point of view there are a few important details that make the deflation 
procedure more effective. We first address the question of avoiding complex arithmetic when 
the matrix A is real. With the above implementation, we may have to perform most of 
the computation in complex arithmetic even when A is real. Fortunately, when the matrix 
A is real, this can be avoided. In this case the Schur form is traditionally replaced by the 
quasi-Schur form, in which one still seeks the factorization (11) but simply requires that the 
matrix Rj, be quasi- triangular, i.e. one allows for 2 x 2 diagonal blocks. In practice, if \j +1 is 
complex, most algorithms do not compute the complex eigenvector yj+ 1 directly but rather 
deliver its real and imaginary parts yn , yi separately. Thus the two eigenvectors y R ± iyi 
associated with the complex pair of conjugate eigenvalues A J+x , Aj+j = A J+1 are obtained at 
once. 

Thinking in terms of bases of the invariant subspace instead of eigenvectors, we note 
that the real and imaginary parts of the eigenvector, generate the same subspace as the 
two conjugate eigenvectors and therefore there is no point in working with the (complex) 
eigenvectors instead of these two real vectors. Hence if a complex pair occurs, we only have 
to orthogonalize the two vectors yn, yi against all previous u,-’s and pursue the algorithm in 
the same way. The only difference is that the size of Uj increases by two instead of just one 
in these instances. 

Another practical consideration, is that one never needs to form A x explicitly. This is 
important because in general this matrix is full. If the algorithm we are using requires 
only matrix by vector multiplications, then clearly an operation of the form y — A\x = 
(A — <7 uv H )x can be performed as follows 

a. Form y := Ax and the scalar t — a (x, v) 

b. Compute y y — t u x 

This procedure only requires that the vectors u x , and v be kept in memory along with 
the matrix A. It is possible to deflate A x again into A 2 , and then into A 3 etc. At each step 
of the process we have 

A,- = A,_i - auivj 1 (25) 

Here one needs only to save the vectors u, and u,- along with the matrix A. However, one 
should be careful about the usage of deflation in general. It should not be used to compute 
more than a few eigenvalues and vectors. This is especially true in the non hermitian case 
because of the fact that the matrix A* will accumulate errors from all previous computations 
and this could be disastrous if the eigenvalue comput ed is poorly conditioned. A careful 
analysis of these accumulation of errors is proposed in the Appendix, where a computable 
upper bound for the error on the computed invariant subspace is proposed. Another serious 
difficulty with deflating with a large number of vectors is the high computational cost. 

We now give a sketch of the Schur- Wielandt deflation procedure for computing the p 
eigenvalues of largest real parts of a matrix A. 
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Algorithm: Explicit Schur-Wielandt Deflation (ESWD) 

1. Initialize: 

j := 0, Uq := {0}, Eo := 0. 

2. Compute next eigenvector (s) : 

Call algorithm (A) to compute the eigenvalue A J+1 (resp. the conjugate pair of eigenval- 
ues A J+ i, Aj +2 = Aj + i) of largest real part of the matrix Aj = A — UjUjUj*, along with 
an eigenvector y (resp. the real part and imaginary part yn, yi of the complex pair of 
eigenvectors). Choose the next shift <Jj+i, and define E J+1 := Diag {< 7 i,<t 2 , . . .<r ;+ i}. 

3. Orthonormalize: 

Orthonormalize the vector y (resp. the vectors yR,yi) against the vectors iti, t^, ...Uj, 
to get u j+ 1 , (resp. uj+ lt u j+2 ). 

Set C/j+i := [Uj, u i+1 ], j := j + 1, (resp. U j+J := [Uj, Uj+i,u j+2 ], j := j + 2.) 

4. Test: 

If j<p goto 2, else set p := j, compute R p := U p AU p and exit. 


We point out that the above algorithm has as a parameter the algorithm (A) , which 
delivers the eigenvalue(s) with largest real part(s) with its (their) associated eigenvector(s). 
The shift <7j +1 in step 2, is chosen so that the eigenvalues Aj, A 2 , . . . A p will in turn be the ones 
with largest real parts during the algorithm. There is much freedom in choosing the shift but 
it is clear that if it is too large then a poor performance in step 2 of the algorithm will result. 
Ideally, we might consider choosing a so the real part of the eigenvalue just computed, i.e. 
Aj coincides with that of the last eigenvalue Ajv- This yields <r J+ i = Re(Xj — A^). Clearly, 
this value is not available beforehand but it suffices to have a rough estimate. Practically, 
we found it convenient and not restrictive in any way to take all shifts equal to some equal 
value <r determined at the very first step j — 1. The matrix Ej then becomes <rl. 

For step 2, we will give more detail in the next sections on how to compute the eigenvectors 
y or the pair of conjugate eigenvectors yn ± iyi. A crucial point here is that the matrix Aj 
is never formed explicitly, since this would fill the matrix and is highly ineffective. Clearly, 
if p is large the computational time of each matrix by vector multiplication becomes very 
expensive. Another potential difficulty which we consider in detail later is the building up 
of rounding errors. 

In step 3, several possibilities of implementation exist. The simplest one which we have 
adopted in our codes consists in a modified Gram Schmidt algorithm which allows for up 
to two reorthogonalizations depending on level of cancellation. Another more expensive 
method of orthogonalizing a set of vectors which is somewhat more robust is the Householder 
algorithm. 
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Before exiting in step 4, the upper triangular matrix R p is computed. For brevity we 
have omitted to say in the algorithm that one need only compute the upper quasi- triangular 
part since it is known in theory that the lower part is zero. Note, that the presence of 
2x2 diagonal blocks requires a particular treatment. Alternatively, we may compute all 
the elements of the upper Hessenberg part of Rp, at a slightly higher cost. However, as will 
be seen in the appendix this is not necessarily the best choice. In the presence of round- 
off, the matrix R p = 1/ff AU P is slightly different from the Schur matrix, and computing 
its eigenvalues correponds to applying a Galerkin process onto the subspace spanned by 
the block U p . The appendix deals in detail with the error analysis of the Schur Wielandt 
deflation. 


2.4 Implicit deflation procedures 

In many instances explicit deflation can be replaced by a procedure that blends more nat- 
urally with the structure of the projection method such as Amoldi or subspace iteration. 
The simplest illustration of this technique is with Arnoldi’s algorithm the standard version 
of which is outlined next. 


Algorithm: Amoldi 

1. Initialize: 

Choose an initial vector ui of norm unity. 

2. Iterate: Do j = 1,2, ...,m 

1. Compute w := Avj 

2. Compute a set of j coefficients so that 

j 

w := w — ^2 hijvi (26) 

ts*X 

is orthogonal to all previous w,’s. 

3. Compute hj+ \j = ||io ||2 and u,-+i = w/hj + i t j. 

The above algorithm produces an orthonormal basis of the Krylov subspace 

K m = spon{ui, Avi , . . . , A”* _1 Ui}. 

The m x m upper Hessenberg matrix H m consisting of the coefficients hjj computed by the 
algorithm represents the restriction of the linear transformation A to the subspace K m , with 
respect to this basis, i.e., we have H m — V^AV mf where V m = [vj, t> 2 , . • • , Wm]- Approxima- 
tions to some of the eigenvalues of A can be obtained from the eigenvalues of H m . This is 
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Arnoldi’s method in its simplest form. In practice m can be fixed and the algorithm can 
be used iteratively, i.e., it is restarted with t>i equal to the last approximate eigenvector 
associated with the desired eigenvalue, until convergence is achieved. This is for the case 
where only one eigenvalue/ eigenvector pair must be computed. In case several such pairs 
must be computed, there are two possible options. The first, suggested in [34] is to take 
t>i to be a linear combination of the approximate eigenvectors. For example, if we need to 
compute the p rightmost eigenvectors, we may take th = £?-i p^u,, where the eigenvalues are 
numbered in decreasing order of real parts. The vector v\ is then obtained from normalizing 
v\. The simplest choice for the coefficients piV is to take p,- = 1, * = 1, ..p. There are several 
drawbacks to this approach, the most important of which being that there is no easy way 
of choosing the coefficients p, in a systematic manner. The result is that for hard problems, 
convergence is difficult to achieve. 

An alternative is to compute one eigenpair at a time and use deflation. One can use 
deflation on the matrix A explicitly as was described earlier. Another implementation, 
which we now describe, is to work with a single basis t>i, uj, ..., v m whose first vectors are the 
Schur vectors that have already computed. Suppose that k — 1 such vectors have converged 
and call them Ui, t> 2) ffc-i- Then we choose a vector v* which is orthogonal to Vi, ...., vjt_i 
and of norm 1. Next we perform m — k steps of an Arnoldi process in which orthogonality 
of the vector Vj against all previous v[s, including Ui, ..., Vk-\ is enforced. This generates an 
orthogonal basis of the subspace 

span{vi,...,vk-i,v k , Av*, . . . , (27) 

Thus, the dimension of this modified Krylov subspace is constant and equal to m in general. 
A sketch of this implicit deflation procedure applied to Arnoldi’s method is the following. 
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Algorithm: Arnoldi’s method with implicit deflation 

A. Initialize: 

Choose an initial vector i>i of norm unity. 

B. Iterate on eigenvalues: Do k = 1,2, ...,p 

1 . Arnoldi Iteration. For j — k, le + 1 , ..., m do: 

• Compute w := Avj 

• Compute a set of j coefficients h,j so that to := w — 23«*i hijVi is orthogonal 
to all previous v,’s, i = 1,2, ...,j 

• Compute h j+lij = |H| 2 and v j+x = w/h j+ i tj . 

2. Compute approximate eigenvector of A associated the eigenvalue A* as well as its 
associated residual norm estimate p *. 

3. Orthonormalize this eigenvector against all previous Vj's to get the approximate 
Schur vector Uk and define Vk := u*. 

4. If p-j is small enough then accept as the next Schur vector, and compute A,-,* = 
(Avi,, Vi) i = 1 Go to EndXoop_B. Else goto B.l. 

EndXoop_B 


Note that in the B-loop, the Schur vectors associated with the eigenvalues Ai, ..., A*_i are 
frozen and so is the corresponding upper triangular matrix corresponding to these vectors. 
As a new Schur vector has converged, step B.4 computes the k- th column of R associated 
with this new basis vector. In the subsequent steps, the approximate eigenvalues are the 
eigenvalues of the m x m Hessenberg matrix H m defined in the algorithm and whose k x k 
principal submatrix is upper triangular For example when m = 6 and after the second Schur 
vector, k = 2, has converged, the matrix H m will have the form, 


H„ 


* * * * * ^ 

* * * * * 

♦ * * * 

* * * * 

* * * 

* */ 


(28) 


Therefore in the subsequent steps, we will consider only the eigenvalues that are not associ- 
ated with the 2x2 upper triangular matrix. 


13 


3 Shift and invert strategies. 

One of the most effective techniques for solving large eigenvalue problems is to iterate with 
the shifted and inverted operator, 

(A — <r/) -1 (29) 

for standard problems and with (for example) 

( K - oMY x M (30) 

for a generalized problem of the form Kx = \Mx. Strategies for adaptively choosing new 
shifts and deciding when to shift and factor ( K — aM) are usually referred to as Shift-and- 
Invert strategies. Thus, Shift-and-Invert simply consists of transforming the original problem 
( A — A I)x = 0 into (A — al)~ x x = fix. The transformed eigenvalues fii are usually far better 
separated than the original ones which results in better convergence in the projection type 
algorithms. However, there is a trade-off when using Shift-and-Invert, because the original 
matrix by vector multiplication which is usually inexpensive, is now replaced by the more 
complex solution of a linear system at every step. When a new shift <r is selected, the 
LU factorization of the matrix (A — crl) is performed and subsequently, at every step of 
Arnoldi’s algorithm (or any other projection algorithm), an upper and a lower triangular 
system are solved. Moreover, the cost of the initial factorization can be quite high and in the 
course of an eigenvalue calculation, one needs to use several shifts, i.e. several factorizations. 
Despite these additional costs Shift-and-Invert is an extremely useful technique especially 
for generalized eigenvalue problems. 

If the shift a is suitably chosen the matrix C — (A — a /) _1 , will have a spectrum with 
much better separation properties than the original matrix A and therefore should require 
far less iterations to converge. Thus, the rationale behind shift and invert technique is that 
factoring the matrix (A — <rl) once, or a few times during a whole run in which a is changed 
a few times, is a price worth paying because the number of iterations required with C is so 
much smaller than that required with A that the expense of the factorization is paid off. For 
the symmetric generalized eigenvalue problem Mx = XKx there are compelling reasons for 
employing a Shift-and-Invert technique. These reasons are discussed at length in [21], [23], 
[28] and [39], the most important one being that since we must factor one of the matrices K 
or M in any case, there is little incentive in not factoring (K — <tM) instead, to gain faster 
convergence. Shift and invert has now become a fairly standard tool in structural analysis 
because of the predominance of generalized eigenvalue problems in this applications area. 

For nonsymmetric eigenvalue problems, much remains to be done to derive efficient Shift- 
and-Invert strategies. Parlett and Saad [24] have examined different ways of dealing with 
the situation where the matrices M and K are real and banded but the shift a is complex. 
One such possibility is to replace the complex operator (K — <tM)~ x M by the real one 

Re[(K - aM)~ x M) (31) 

whose eigenvectors are the same as those of the original problem and whose eigenvalues fi{ 
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1 


(32) 


are related to the eigenvalues A, of (Af, K) by 


1 . 1 

Mi = -[- + 


2A,— <7j A, - <r,- 




One clear advantage of using (31) in place of (30) is that the latter operator is real and 
therefore all the work done in the projection method can be performed in real arithmetic. A 
nonnegligible additional benefit is that the complex conjugate pairs of eigenvalues of original 
problem are also approximated by complex conjugate pairs thus removing some potential 
difficulties in distinguishing these pairs when they are very close. On the practical side, the 
matrix (K — <tM) must be factored into the product LU of a lower triangular matrix L and 
an upper triangular matrix U. Then every time the vector w = Re[(K — aM)~ 1 M]v must 
be computed, the forward and backward solves are processed in the usual way and then the 
real part of the resulting vector is taken to yield w. 

Let us now consider the implementation of Shift-and-Invert with an algorithm such as 
Amoldi’s method. Assume that the problem is to compute the p eigenvalues closest to a shift 
<To- In the symmetric case there is an important tool that is used to determine which of the 
approximate eigenvalues should be considered in order to be able to compute all the desired 
eigenvalues in a given interval only once. This tool is Sylvester’s inertia theorem which gives 
the number of eigenvalues to the right and left of a by counting the number of negative 
entries in the diagonal elements of the U part of the LU factorization of the shifted matrix. 
In the nonsymmetric case a similar tool does not exist. In order to avoid the difficulty we 
exploit deflation in the following manner. As soon as an approximate eigenvalue has been 
declared satisfactory we proceed to a deflation process with the corresponding Schur vector. 
The next run of Arnoldi’s method will attempt to compute some other eigenvalue close to <t 0 . 
With proper implementation, the next eigenvalue will usually be the next closest eigenvalue 
to <7 0 . However, there is no guarantee for this and there is no guarantee that an eigenvalue 
will not be missed. This is a weakness of projection methods in the nonsymmetric case, in 
general. 

Our experimental code ARNINV based on this approach implements a simple strategy 
which requires two parameters from the user and proceeds as follows. The code 

starts by using cr 0 as an initial shift and calls Amoldi’s algorithm with (A — <r 0 7)” 1 Arnoldi 
to compute the eigenvalue of A closest to <7o. Amoldi’s method is used with restarting, i.e., 
if an eigenvalue fails to converge after the Arnoldi loop we rerun Amoldi’s algorithm with 
the initial vector replaced by the eigenvalue associated with the eigenvalue closest to <Jq. The 
strategy for changing the shift is dictated by the second parameter m 2 . If after m 2 calls to 
Arnoldi with the shift ctq the eigenpair has not yet converged then the shift is changed to 
the best possible eigenvalue close to.<7 0 and we repeat the process. As soon as the eigenvalue 
has converged we deflate it using Schur deflation as described in the previous section. The 
algorithm can be summarized as follows. 
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Algorithm: Shift-and-invert Arnoldi’s method with implicit deflation 

A. Initialize: 

Choose an initial vector Vi of norm unity, an initial shift <r, the dimension mi of the 
Krylov subspaces to be used, and the number m 2 of calls to Arnoldi before reshifting. 

B. Eigenvalue loop: Do k = 1, 2, ..., p 

1. Compute the LU factorization of (A — ol). 

2. If k > 1 then (re)-compute {h,j = ((A — <jI)~ x Vj, u«)}t,j=i,Jfe-i- 

3. Arnoldi Iteration. For j = k, k + 1, ..., m do: 

• Compute w := (A — <rI)~ x Vj 

• Compute a set of j coefficients so that w := w — J2i=i hijVi is orthogonal 
to all previous u.’s, i = 1,2, ..., j 

• Compute h j+hj := ||u>|| 2 and v j+1 := w/h j+hj . 

4. Compute eigenvalues of H m of largest modulus and get corresponding approxi- 
mate eigenvector of (A — <rl)~ x and an estimated error />* on the eigenvalue. 

5. Orthonormalize this eigenvector against all previous u/s to get the approximate 
Schur vector u* and define v* := u*; 

6. If pic is small enough then accept v* as the next Schur vector. Go to End_Loop_B. 

7. If the number of restarts with the same shift exceeds m 2 select a new shift and 
goto 1. Else restart Arnoldi’s algorithm, i.e., goto 3. 

EndJjOopJ}. 

A point of detail in the algorithm is that the kxk principal submatrix of the Hessenberg 
matrix H m is recomputed whenever the shift changes. The reason is that this submatrix 
represents the matrix ( A— <rl)~ x in the first k Schur vectors and therefore it must be updated 
as a changes. This is in contrast with the simple Arnoldi procedure with deflation described 
earlier. The above algorithm is described for general complex matrix and there is not attempt 
in it to avoid complex arithmetic in case the original matrix is real. In this situation, we 
must replace (A - <rI)~ x Vj in B.2 by Re[(A - oI)~ x vj] and make sure that we select the 
eigenvalues corresponding to the eigenvalues of A closest to a. We also need to replace the 

occurrences of eigenvectors by the pair of real parts and imaginary parts of the eigenvectors. 

i 

i 

4 Polynomial Preconditioned Arnoldi Algorithm 

There are various ways of preconditioning a linear linear system Ax = b prior to solving it 
by a Krylov subspace method. Preconditioning consists in transforming the original linear 
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system into one which requires fewer iterations with a given Krylov subspace method, without 
increasing the cost of each iteration too much. For eigenvalue problems similar methods 
have not been given much attention although the Shift-and-Invert technique can be viewed 
as a means of preconditioning. Moreover, Davidson’s method is nothing but a form of 
preconditioned Lanczos algorithm, where the preconditioning matrix is a diagonal matrix 
which varies at every step. This idea has been exploited by Morgan and Scott [20] who 
propose a generalization of Davidson’s algorithm based on more general preconditioners. 
Scott [38] also propose a preconditioned Lanczos procedure for the generalized eigenvalue 
problem. These techniques involve approximating the inverse of (A — a I) by a matrix of 
the form ( M — where M can be some approximation of A. Note that in the ideal 

case where a is an exact eigenvalue of A, and as ( M — cl)~ x is usually nonsingular then the 
rest of the eigenvalues will tend to cluster around one. This good separation will make the 
algorithm deliver the eigenvalue closest to <7 very quickly. These alternatives to Shift-and- 
Invert might be useful in the case where factoring the matrix (A — a I) is out of the question 
because of the size of the problem. 

For a classical eigenvalue problem, one alternative is to use polynomial preconditioning 
as is described next. The idea of polynomial preconditioning is to replace the operator B by 
a matrix of the form, 

Br = Pr{A) (33) 

where p r is a degree r polynomial. Ruhe [27] considers a more general method in which p r is 
not restricted to being a polynomial but can be a rational function. When an Amoldi type 
method is applied to B r , we do not need to form B r explicitly, since all we will ever need in 
order to multiply a vector x by the matrix B r is k matrix- vector products with the original 
matrix A and some linear combinations. 

Instead of attempting to compute several eigenvalues of A at once as was suggested in 
[34,29,32] the method proposed here is to compute only one eigenvalue at a time or possibly 
a pair of complex conjugate eigenvalues at a time. Deflation is then used to compute the 
next desired eigenvalues and eigenvectors until satisfied. The goal is to improve robustness, 
sometimes perhaps at the expense of efficiency. The preconditioning methods rest on the 
idea that all the difficulties in Arnoldi type methods, come from the poor separation of the 
desired eigenvalues. The real problem is that often the desired eigenvalues are clustered 
while the non wanted ones are well separated, which results in the method being unable to 
retrieve any element of the cluster and leads to very poor performance, often divergence. 

For fast convergence, we would ideally like that the next wanted eigenvalue of A be 
transformed by p r into an eigenvalue of B r that is very large as compared with its remaining 
eigenvalues. There are many ways of providing a satisfactory solution to this problem. The 
one considered here is derived from [32]. First we impose the constraint 


Mh) = 1 (34) 

Thus, we can attempt to minimize some norm of p r in some region subject to the con- 
straint (34). We can choose the norm of the polynomials, to be either the infinity norm 
or the Z^-norm. Because it appears that the Z 2 *norm offers more flexibility and performs 
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usually slightly better than the infinity norm, we will only consider a technique based on 
the least squares approach. We should emphasize, however, that a similar technique using 
Chebyshev polynomials can easily be developed. The procedure for computing the least 
squares polynomials has been described in detail elsewhere and we will refer the reader to 
the articles [32,31]. 

Once the polynomial p r is calculated the preconditioned Arnoldi process consists in using 
Arnoldi’s method with the matrix A replaced by B r = p r (A). This will provide us with ap- 
proximations to the eigenvalues of B r which are related to those of A by A i(B r ) = p r (Aj(A)) 
It is clear that the approximate eigenvalues of A can be obtained from the computed eigen- 
values of B r by solving a polynomial equation. However, the process is complicated by the 
fact that there are k roots of that equation for each value Xi(B r ) that are candidates for rep- 
resenting one eigenvalue A, (A). The difficulty is by no means unsurmountable but we have 
preferred a more expensive but simpler alternative based on the fact that the eigenvectors of 
A and B r are identical. At the end of the Arnoldi process we obtain an orthonormal basis V m 
which contains all the approximations to these eigenvectors. A simple idea is to perform a 
Galerkin process for A onto Span[V m ] by explicitly computing the matrix A m = Vj^AV m and 
its eigenvalues and eigenvectors. Then the approximate eigenvalues of A are the eigenvalues 
of A m and the approximate eigenvectors are given by V m y< where is an eigenvector of 
A m associated with the eigenvalue A <. A sketch of the algorithm is as follows. 
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Polynomial Preconditioned Arnoldi with Deflation 


A. Start: Choose the degree r of the polynomial p r , the dimension m of the Arnoldi 
subspaces and an initial vector V\. 

B. Initial Arnoldi Step: Using the initial vector vj, perform m steps of the Arnoldi method 

with the matrix A. 

C. Eigenvalue loop. Do k = 1,2, ...,p : 

1. Projection Step: 

• Obtain the matrix A m — V*AV m and its m eigenvalues {Ai,...A m } and 
eigenvectors 

• Compute the approximate eigenvector V m y*, and orthogonalize it against all 
previous u,’s to get the next approximate Schur vector u*. Get the corre- 
sponding residual norms p*. 

• If pk is small enough then goto End Joop_C. 

• Adapt: From the previous convex hull and the set {A* +J , . . . , A m } construct 
a new convex hull of the unwanted eigenvalues. 

• Compute the new least squares polynomial p r of degree r. 

2. Arnoldi iteration: 

Starting with v* = «*, generate m — k vectors by Arnoldi’s method applied to 
the matrix B r = p r (A). The result is a set of m orthonormal vectors V m = 
[t>l,U 2 , •••,%]. Go to 1. 

EndJoop.C 

When passing from step 2 to step 3, it is not necessary to actually compute the matrix 
A m which is available in step 2 as the Arnoldi matrix H m . We have skipped some details 
concerning the deflation process because of the resemblance with the process of the Shift- 
and-invert algorithm described earlier. 

5 Numerical experiments 

All numerical tests have been performed on an Alliant FX-4, using double precision, i.e., 
the unit roundoff is 2 -56 « 1.3877 x 10“ 17 . Our test example, taken from [26], models 
concentration waves in reaction and transport interaction of some chemical solutions in a 
tubular reactor. The concentrations x(r, z), y(r, z) of two reacting and diffusing components, 
where 0 < z < 1 represents a coordinate along the tube, and r is the time, are modeled by 
the system: [26]: 


dx D x d 2 x 


dr L 2 dz 2 


r sv") y)i 


with the initial condition 

z(0, z) = x 0 (z), y( 0, z) = yo(z), V z 6 
and the Dirichlet boundary conditions: 


[ 0 , 1 ], 


x(0, r) = x(l,r) = x 


y(0,r) = y(l,r) = y. 

The linear stability of the above system is traditionally studied around the steady state 
solution obtained by setting the partial derivatives of x and y with respect to time to be 
zero. More precisely, the stability of the system is the same as that of the Jacobian of (5) - 
(5) evaluated at the steady state solution. In many problems one is primarily interested in 
the existence of limit cycles, or equivalently the existence of periodic solutions to (5), (5). 
This translates into the problem of determining whether the Jacobian of (5), (5) evaluated 
at the steady state solution admits a pair of purely imaginary eigenvalues. 

We consider in particular the so-called Brusselator wave model [26] in which 


f(x , y) = A - (B + l)x + x 2 y 


g(x, y) = Bx - x 2 y. 

Then, the above system admits the trivial stationary solution x = A, y = B/ A. A stable 
periodic solution to the system exists if the eigenvalues of largest real parts of the Jacobian 
of the right hand side of (5), (5) is exactly zero. For the purpose of verifying this fact 
numerically, one first needs to discretize the equations with respect to the variable z and 
compute the eigenvalues with largest real parts of the resulting discrete Jacobian. 

For this example, the exact eigenvalues are known and the problem is analytically solv- 
able. The article [26] considers the following set of parameters 


D x = 0.008, D y = ]-D x = 0.004, 

A 

A = 2, B = 5.45 

The bifurcation parameter is L. For small L the Jacobian has only eigenvalues with negative 
real parts. At L w 0.51302 a purely imaginary eigenvalue appears. Our tests verify this fact. 

Let us discretize the interval [0, 1] using n + 1 points, and define the mesh size h = 1/n. 
The discrete vector is of the form where x and y are n-dimensional vectors. Denoting 
by fh and the corresponding discretized functions / and g, the Jacobian is a 2 x 2 block 
matrix in which the diagonal blocks (1,1) and (2, 2) are the matrices 




dfhj^y) 

dx 



and 


respectively, while the blocks (1, 2) and (2, 1) are 

dfhj^y) dgh(s, y) 

dy dx 

respectively. Note that since the two functions / and g do not depend on the variable z, the 
Jacobians of either fk or gn with respect to either x or y are scaled identity matrices. We 
denote by A the resulting 2n x 2n Jacobian matrix. We point out that the exact eigenvalues 
of A are readily computable, since there exists a quadratic relation between the eigenvalues 
of the matrix A and those of the classical difference matrix Tridiag{ 1,-2, 1}. 

We will refer to the shift-and-invert Arnoldi algorithm described in Section 3 as ARNINV 
and to the polynomial preconditioned Arnoldi method of Section 4 as ARNLS. It is difficult 
to select a suitable stopping criterion for nonsymmetric eigenvalue problems. In our case we 
have adopted to stop as soon as the residual norm is smaller than some tolerance e. However, 
the matrices may be scaled differently and we decided to scale the residual norms by the 
average singular value of the Hessenberg matrix produced by the projection process. More 
precisely, at every step we compute the square of the Frobenius norm f m = Trace 
and take as an estimate of the error of the computed pair eigen value/ eigenvector the number 

P 

where p is the computed residual norm provided by the method. Note that the denominator 
represents the square root of the average of the squares of the singular values of H m . In 
ARNLS the same scaling is used except that for the projection step (Step 4 of Algorithm 
ARNLS), H m is replaced by the matrix A m . Recall [34] that it is not necessary to compute 
the eigenvectors explicitly in Arnoldi in order to get the residual norms because these are 
equal to the products of h m+1>m by the last component of the corresponding normalized 
eigenvectors of the matrix H m . 

We used a discretization of n = 100 subintervals, i.e., the size of the resulting matrix 
is 200. We tested ARNINV to compute the six rightmost eigenvalues of A. We took as 
initial shift the value a = 0, and m\ = 15, m 2 = 10. In this case ARNINV delivered all the 
desired eigenvalues by making four calls to the Arnoldi subroutine and there was no need for 
chaning shifts. The tolerance imposed was e = 10~ 7 . The result of the execution is shown in 
Figure 1. What is shown in the figure is the progress of the algorithm after each projection 
(Arnoldi) step. The headings indicate the number of the eigenvalue that is being computed. 
Thus, when Arnoldi is trying to compute the eigenvalue number 3, it has already computed 
the first two (in this case a complex conjugate pair), and has deflated them. We print the 
eigenvalue of interest, i.e., the one we are trying to compute, plus the one that is likely to 
converge after it. The last column shows the actual residual norm achieved. 

We rerun the above test with the initial shift <7, namely <Tq — — 0.5 + 0. 2i and we changed 
m 2 to m 2 — 3. Initially, the run looked similar to the previous one. A pair of complex 
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conjugate eigenvalues were found in the first Arnoldi iteration, then another pair in the 
second iteration, then none in the third iteration and one pair in the fourth iteration. It 
took two more iterations to get the eigenvalues number 7 and 8. For the last eigenvalue a 
new shift was taken because it took three Arnoldi iterations without success. However the 
next shift that was taken was already an excellent approximation and the next eigenvalue 
was computed in the next iteration. The cost was much higher than the previous run with 
the cpu time climbing to approximately 5.65 seconds. 

We now illustrate the use of ARNLS on the above example. We fixed the dimension of 
the Krylov subspace to be always equal to m = 15. The degree of the polynomial was taken 
to be 20. However, note that the program has the capability to lower the degree by as much 
as is required to ensure a well conditioned Gram matrix in the least squares polynomial 
problem. This did not happen in this run however, i.e. the degree was always 20. Again, 
ARNLS was asked to compute the six rightmost eigenvalues. The run was much longer so 
its history cannot be reproduced here. Here are however a few statistics. 

• Total number of matrix by vector multiplications for the run = 2053; 

• Number of calls to the projection subroutines = 9; 

• Total cpu time used = 3.88 sec. 

Note that the number of projection steps is more than twice that required for Shift-and- 
invert. The execution time is also more than 80 % higher. We rerun the same program by 
changing only two parameters: m was increased to m = 20 and the degree of the polynomial 
was set to r = 15. The statistics are now as follows: 

• Total number of matrix by vector multiplications for the run = 1144; 

• Number of calls to the projection subroutines = 5; 

• Total cpu time used = 2.47 sec. 

Both the number of projection steps and the execution times have been drastically re- 
duced and have come closer to those obtained with shift-and- invert. One of the disadvantages 
of polynomial preconditionings is precisely this wide variation in performance depending on 
the choice of the parameters. To some extent there is a similar dependence of the per- 
formance of ARNINV on the initial shift, although in practice a good initial shift is often 
known. A superior feature of shift and invert is that it allows to compute eigenvalues in- 
side the spectrum. Polynomial preconditioning can be generalized to this case but does not 
perform too well in general. We should also comment on the usefulness of using polynomial 
preconditioning in general. One often heard argument against polynomial preconditioning is 
that is it suboptimal: in the symmetric case the conjugate gradient and the Lanczos meth- 
ods are optimal polynomial processes in that they provide the best possible approximation, 
in some sense, to the original problem from Krylov subspaces. Hence the argument that 
polynomial preconditioning would not do as well if one counts the total number of matrix by 
vector multiplications. The counter argument in the nonsymmetric case is a simple one: the 
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computing eigenvalue number 1 


re(lambda) 



im(lambda) 

* 

res. norm 

* 

0 . 1807540453D-04 

+ 

i. 

0 .21394975480+01 

* 

0.212D-09 

* 

0 . 1807540453D-04 

+ 

i. 

-0. 21394975480+01 

* 

0.212D-09 

* 

-0 . 6747097569D+00 

+ 

i. 

0.25285599180+01 

* 

0.224D-06 

♦ 

-0.6747097569D+00 

+ 

i. 

-0.25285599180+01 

♦ 

0 . 224D-06 

* 


computing eigenvalue number 3 


re (lambda) 



im(lambda) 

♦ 

res. norm 

* 

-0.67470975690+00 

+ 

i. 

0 . 2528559918D+01 

* 

0.479D-13 

* 

-0 . 6747097569D+00 

+ 

i. 

-0.25285599180+01 

♦ 

0.4790-13 

♦ 

-0 . 2780085122D+01 


i. 

0 . 2960250300D+01 

* 

0.3360-01 

♦ 

-0.27800851220+01 

+ 

i. 

-0.29602503000+01 

♦ 

0.3360-01 

* 


computing eigenvalue number 5 


re (lambda) 


im (lambda) 

* 

res. norm 

* 

-0 . 1798530837D+01 

+ i. 

0.30321646440+01 

* 

0 . 190D-06 

* 

-0.17985308370+01 

+ i. 

-0.3032164644D+01 

* 

0.1900-06 

* 


computing eigenvalue number 5 


re (lambda) 

im(lambda) 

* res. norm * 

-0 . 1798530837D+01 + i. 

0.30321646440+01 

* 0.1020-11 * 


-0 . 1798530837D+01 + i. -0 .30321 64644D+01 * 0.102D-11 * 

-0 . 2119505960D+02 + i. 0. 1025421954D+00 * 0.749D-03 * 


average error ■ 0.6820322E-14 

total execution time >2.13 sec. 


Figure 1: Convergence history of ARNINV for first test. Each separate output corresponds 
to a call to Arnoldi’s module 
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optimality result is no longer true. For linear systems, algorithms such as GMRES or GCR, 
are known to be optimal in the sense that they obtain the solution with smallest residual 
norm in the Krylov subspace [36,35]. However, this optimality is only valid for the highly 
impractical case where a full orthogonalization process with no restarting or truncation is 
undertaken at every step. In fact even in the symmetric case the optimality result is only 
true in exact arithmetic, which is far from the real situation where loss of orthogonality is a 
rather severe and damaging phenomenon. 

The next question is whether a simple restarted Arnoldi algorithm would perform better 
than a polynomial preconditioned method. The answer is a definite no. A run with ARNIT 
[33] an iterative Arnoldi method with deflation failed even to deliver the first eigenvalue of 
the test matrix used in the above example. The initial vector was the same and we tried two 
cases m = 15, which did not show any sign of convergence and m = 20 which might have 
eventually converged but was extremely slow. 

6 Summary and Conclusion 

We have presented essentially two methods for computing a few eigenvalues and the cor- 
responding eigenvectors or Schur vectors of large nonsymmetric matrices. Both techniques 
rely heavily on a Schur Wielandt deflation procedure and preconditioning. The first one uses 
Shift-and-Invert preconditioning and the second a form of polynomial preconditioning. These 
methods are of interest only when the number of eigenvalues to be computed is relatively 
small, such as when dealing with the stability analysis in nonlinear differential equations, 
or in the analysis of various bifurcation phenomena. Our analysis of the appendix and our 
experiments indicates that the Schur-Wielandt deflation is safe to use in general. There is 
an a-posteriori upper bound (see appendix) which can be used in practice to estimate the 
accuracy of the computed basis of the invariant subspace. 

Although not mentioned before, the deflation technique can also be of great help when 
dealing with the generalized eigenvalue problem. If one uses an Arnoldi or a nonsymmetric 
Lanczos method, big savings in computational cost can be achieved with deflation because 
it allows one to compute more eigenvalues with the same shift and so fewer expensive factor- 
izations must be performed. In essence the selective orthogonalization technique developed 
by Parlett and Scott [23,40] realizes a similar deflation technique in the symmetric case but 
does so in a more economical way. 
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7 APPENDIX: Error Analysis of Schur Wielandt de- 
flation 

In this appendix, we propose a few a posteriori error bounds in order to analyse the stability 
of the deflation technique. Typically, at each step j = 1,2,. . .p of the deflation process we 
compute an approximate eigenvalue A ; and an associated normalized eigenvector yj of the 
matrix Aj-\ = A — As a convention we define A 0 to be the matrix A. The 

approximate eigenpair satisfies the relation 

Aj-iyj = \jVj + rjj, j = 1, . . . p (35) 

where the residual vector rjj is some vector of small norm and is assumed to include both 
the effects of approximation and rounding. It is assumed that the matrix U p is orthonormal 
to working precision. Our purpose is to provide some information on the accuracy of the 
Schur basis U p and possibly of the eigenvalues obtained from the approximate eigenvalues 

Ajj, j — 1, . . . p. 

At step number j, the vector is orthogonalized against to obtain the 

j th approximate Schur vector Uj. This is realized by a Gram-Schmidt process and as a result 
the following relationship between the vectors u,- and yj holds: 

£ = Vi i = (36) 

i=i 

Denoting by bj the vector of p components ftj, ftj, . . . , ft,-, 0, 0, ... 0, the above relation can 
be rewritten as 

U p bj = yj . (37) 

Replacing this relation in (35), we have 

(A - = \,u,b, + n, (38) 

or 

AU p bj = Uj. x ^j. x Uf. x U p bj + A jU p bj + rjj. (39) 

Although there are only p — 1 shifts <r, used when p eigenvalues are computed, it is convenient 
to define <r p = 0 and 

£,,= Diag {&i, <72,..., <Tp-i t <7 p }. (40) 

Then (39) becomes 

AUpbj = U p [E p + (Aj - <Tj)I\ bj + rfc, j = l,2,...,p. (41) 

Let B p be the p x p upper triangular matrix having as its column vectors the 6-s, E p the 
N x p matrix having as its column vectors the p,-s and A p = Diag{\ x , A 2 , . . . A p }. Then the 
above relation translates into the matrix relation: 

AU P B P = U p [E P B P + B p (Ap - E p )j + E p , (42) 
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which we rewrite in a final form as 


AU r = U , [s p + B,{ A, - Sp)Bp-'] + E P B;'. 

(43) 

For convenience, we define 


z„ = EpB; 1 

(44) 

and 


Cp = Sp + Bp(Ap - Sp )B;\ 

(45) 


Observe that when <7< = A,, i = 1, . . . p — 1 then the matrix U p diagonalizes partially the 
matrix A if E p = 0. 

At the final stage of Algorithm ESWD, there are two ways of post processing before 
exiting. 

• Either one accepts the values A i,i = 1, . . .p as approximate eigenvalues and does not 
attempt to improve them. The representation of the section of A in the approximate 
invariant subspace U p is taken to be the matrix C p defined by (45). 

• Or one performs a final Galerkin projection onto the subspace spanned by U p in order 
to improve the current approximations. This is done by replacing the approximate 
eigenvalues A,-, i = 1, ... p by the eigenvalues of the matrix R p = Uff AU P . 

We will mainly focus our attention on the second approach, which is more attractive. 
In this case the Galerkin process involves some extra work, since the computation of the 
matrix R p itself costs us p 2 inner products. However, since p is small this is negligible as 
compared with the total work incurred during the whole computation. Note that R p is a 
full matrix with small lower triangular part, and one might still want the partial Schur form 
corresponding to the improved eigenvalues. This is easily done by computing the Schur 
factorization of the matrix R p , Rp = Q p S p Q p and then defining the new U p matrix by 
Up, new ~ U p Q p . 

Consider any N x(N — p) matrix W = [u?i, u> 2 , . . . w^-p] which complements the matrix 
U p into an orthonormal N x N matrix, i.e., so that the matrix [U p , W] is orthonormal. The 
matrix representation of the matrix A in this new basis is such that 

A[U„W\ = IU„W\{ W \ £’), ( 46 ) 

in which Xu = U^ AW, X 22 = W H AW, and Rp have been defined above. 

The above equation indicates that [U p , W] almost realizes a Schur factorization of A when 
Z p is small. In fact, the factorization can be rewritten in the following form: 

A-W" w )( w °z, o)W” w]H ~ [ u ” w Ko (47) 

When a Galerkin correction step is taken, then the approximate Schur factorization corre- 
sponds to taking U p as the basis of the eigenspace and R p as the representation of A in 
that subspace. As a consequence, in the approach using a correction step, equation (47) 
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establishes that the final result is equivalent to perturbing the initial matrix A by a matrix 
which is unitarily similar to the matrix 


{w°z p o)’ (48) 

Thus, the eigenvalues of R p will be good approximations of those of A if they are well 
conditioned, whenever the norm of W H Z P is small. The first case (no correction) can be 
treated in the same way and one can easily prove that the perturbation matrix is unitarily 
similar to „ 

(V?z, O', 

\w H z r o)' 

This analysis proves that the key factor for the stability of the deflation method is the way 
in which the norm of Z p increases. 

We now wish to provide a result which establishes an a-posteriori upper bound of the 
Frobenius norm of Z : as j increases. The column vectors Zj, j — 1, 2, ... p of Z p satisfy the 
relation: 

»7j = 2 Pij z i ( 50 ) 

«=i 

from which we derive the upper bound 



j - 1 


ftlMlslhll + EAM 

1=1 


(51) 


Using the Cauchy-Schwartz inequality for the last term on the right-hand side we get 



fi-1 1 

1/2 

3 - 1 

/ytall ^ ltall + 

- — 1 

W-1 

1 

• 

EINI S 

»=! 


(52) 


Since we have assumed that the eigenvector y;, which is orthogonalized against the previous 
u' t s, is of norm unity, an important observation is that the sum of the squares of the is one 
and /?jj represents simply the sine of the angle 9j between j ij and the subspace spanned by 
the vectors u,-, i = 1, . . - 1. Therefore, denoting by pi the Frobenius norm of the matrices 
Z„ i = 1, . . . p the above inequality reads 


sin(dj) 1 12,-|| < \\Vj\\ + cos($j)pj-i. (53) 

Adding the term sin(0j)pj - 1 to both sides and using the inequality (a 2 + b 2 ) 1 ^ 2 < a + b for 
the resulting left hand-side we obtain 

sin(6j)pj < \\r}j\\ + (sinOj + cos6j)pj- 1 , (54) 


which is restated in the following proposition. 
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Proposition 7.1 The Frobenius norms pj of the matrices Zj,j = l,...p satisfy the recur- 
rence relation 

Pj < (1 + cot Oj)pj-i + (55) 

where Oj is the acute angle between the eigenvector yj obtained at the j th deflation step and 
the previous approximate invariant subspace span{U j_i) and where rjj is its residual vector. 

It is important to note that since by definition sin 9j = /?yy all the quantities involved 
in the proposition are available during the computation and so the above recurrence is 
easily computable starting with the initial value po = 0. The result can be interpreted as 
follows: if the angle between the computed eigenvector and the previous invariant subspace 
is small at every step then the process may quickly become unstable. On the other hand 
if this is not the case then the process is quite safe, for small p. The interesting point 
is that the above recurrence can practically be used to determine whether or not there is 
such a risk of instability. The cause of the potential instability is even narrowed down to the 
orthogonalization process. If each newly computed vector yj were orthogonal to the previous 
ones then clearly B p would be the identity matrix and there would be no risk of amplification 
of errors. This opens up an interesting possibility. Assume that instead of computing an 
approximate eigenpair Ay, yj satisfying the relation (35) one is able by some hypothetical 
procedure to compute a Schur pair directly, i.e., a pair Ay, uy satifying the analogous relation 

i - 1 

Aj.xUj = Aytiy + £ 'TijUj + T)j. (56) 

1=1 

Then an analysis similar to the one used to establish (43) would easily lead to the relation 
AU P = UpR p + E p where R p is the upper triangular matrix having the diagonal elements 
A,, i = l,p and the off diagonal elements 7,-y, while E p is defined as before. Thus, in this case 
Zp is simply replaced by E v and the process is always stable. In a way, however, the difficulty 
is rejected to the hypothetical procedure that would compute the Schur pair. As an example, 
a naive algorithm for computing a Schur pair would be to compute the eigenpair and then 
orthogonalize the eigenvector yj against the previous u-s to get «y. By doing so a relation of 
the form (56) is always satisfied and rjy and its norm can be explicitly computed. If ||»7y|| is 
not sufficiently small one goes back to compute the eigenpair Ay, j/y to higher accuracy until 
||r/j|| is as small as wanted. The issue of whether there may exist other methods that deliver 
directly Schur vectors, is worth investigating. 

In this illustrative test taken from [33], we verify the error bound (55). The test matrix is 
the same as that of the numerical experiments section, but of size N=100, which corresponds 
to a discretization of n = 50 interior mesh points. We have computed the 10 rightmost 
eigenvalues and their associated Schur vectors by using an algorithm based on a polynomial 
accelerated Arnoldi method as described in [33] which is different from ARNLS. We used 
with m = 10, and polynomial of degree 100 = 5 x 20. Here, the stopping criterion for each 
eigenpair is that the actual residual norm be less than e = 10 -5 . In other words the norms of 
the vectors 77, as defined by (35) are less than e except for rounding in the actual computation 
of this residual which is negligible in view of the fact that t is large compared to the unit 
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m 



2 

4 

6 

8 

10 

0.2679108E-05 

0.8961249E-05 

0.1313459E-04 

0.1398945E-04 

0.1397979E-04 

0.1161542E-04 

0.1857656E-04 

0.3268575E-04 

0.6703071E-04 

0.6776894E-04 


Table 1: Comparison of the estimated Frobenius norms of the errors in the invariant sub- 
spaces with the actual norms. 

round-off. As soon as a new pair of complex conjugate eigenvalues converged, we computed 
the corresponding new Frobenius norm of Zj and the corresponding estimate given by (55). 
The results are shown in Table 1. The 10 rightmost eigenvalues are all complex and so they 
appear in pairs. In this example, in fact in all our tests conducted with this class of test 
matrices, there is a good agreement between the estimated norm and its actual value. 
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